The following work deals with the control of data errors in information systems. After a brief survey, the inadequacy of the existing methods is established. It is submitted that in many application areas, automatic error correction is essential. A novel single and exchange error correcting code is developed. A quantitative method, based on a model, is given to choose from amongst the data entry methods.

Next, the problem of secrecy transformations in system design is considered. Some simple algorithms are presented; a method of complicating them is indicated, together with a quantitative method of evaluating them.

Finally, a live case is taken up to illustrate the above concepts.

INTRODUCTION

A characteristic invariably found in all information systems is that the volume of data is large and the processing required on the data is simple. Usually about 90% of the work done on the computer is for input editing, procedure control, output editing and formatting. A large part of this is to reduce the effect of data errors on processing.

In spite of its great importance, this problem has not received adequate attention, and as a result, input data editing is, at present, very simple and largely inadequate. Usually several checks are made in every system, but each of the checks is only a necessary condition for the correctness of data, and none of them is a sufficient condition. Consequently, in spite of the most elaborate editing procedures, a number of errors still persist in the data input to the processor. All the errors which are detected have to be corrected manually, a process which involves a large amount of delay. Also, if a batch of data is large, then there could be errors in the correcting of errors detected previously, and hence the problem becomes more severe.

The present work mainly deals with the above problem. In Chapter 2, the existing data error control methods are discussed. The main method is editing. This includes error detecting which is possible by proper coding, some properties of individual fields, and interfield relationships. Batch controls and cross-footing checks also permit some error detection. But the most interesting method is by the inclusion of check digits. After analysing such methods, the need for error correcting is brought out.

Chapter 3 presents a proposed error correcting code. The method is described, proofs are given, followed by some comments on the algorithm.

Chapter 4 compares the data entry method using this error correcting code with some of the well-known existing methods. The comparison is effected by means of a simple model developed for this purpose, and results are presented.

Chapter 5 considers secrecy transformations, a topic not quite related to data error controls. Some of the data in data bases is secret and has to be protected. In some applications secret data has to be handled manually, and it becomes necessary to code it. This problem is analysed, some simple algorithms are given, and a quantitative method of evaluating the algorithms is developed.

Chapter 6 takes up a case study which, apart from being live, is well suited for the application of the concepts developed herein.

Chapter 7 is the conclusion. Apart from discussing what has been achieved, it also recommends the application areas in which the methods developed can be profitably employed.

A comprehensive bibliography is given for the interested research workers.



CHAPTER 2

EXISTING DATA ERROR CONTROL METHODS

2.1 Overview :-

This chapter briefly reviews the existing data error control methods. All of them are in the form of checks. Some of the checks are natural, and others are artificially introduced. Examples of natural checks are the validity checks possible in significant digit codes, limit checks on fields, inter-field relationships, etc. Examples of artificial checks are record count, control total, cross-footing checks, etc. The main contention is that these checks are too naive and elementary. The only interesting check is the modulus 11 check digit. Though these are useful to point out certain errors, it is pointed out that they are in no way complete, and errors can still persist. Scientific investigation is required here. In particular, a simple error correcting code is required which is acceptable for information systems.

2.2 Editing :-

Editing is a procedure in which all possible checks are made on input data, to point out as many errors as possible. The checks used at present are:

1) Record count
2) Control totals
3) Proof figures
4) Type checks
5) Limit checks
6) Inter-field checks
7) Cross-footing checks
8) Tape and disk labels
9) Check point and restart
10) Sequence check

Record count is the number of records in a batch. It is found manually, and put as a special record after the batch. The number of records read is counted and the total is tallied with this record count. The aim is to discover whether there are omissions or duplications of records.

Control totals can be used for numeric fields. The same field in all the records of a batch is added to get a control total, whether the total makes sense or not. While reading, the computer also finds this total and tallies it with the total fed in.

A proof figure is a number carried with a field which has a numeric value, such that their sum adds up to a constant decided upon previously.

Type checks involve checking the characters of a field which should be either purely numeric, or purely alphabetic.

Limit checks can be performed on numeric fields. Usually the quantity in any field can lie only in some range, and this check sees whether it actually does.

Inter-field checks can point out and even correct some errors. For instance, if identification codes lie in some ranges for group A, and some other ranges for group B (whatever that means), and if a batch of transaction records is fed in which all identification codes are present in increasing order, then if it is found that the group is mentioned as B and the code lies in the A range, the group can be safely corrected.

Cross-footing checks essentially involve doing an operation in two ways and tallying the result.

Tape and disk labels are useful to check whether the correct batch is being brought in, and they also contain some other information.

Check point and restart is actually not a part of editing; these are points in a program where the processing is proved and enough information is stored to restart processing at that point.
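A few of the batch checks described above can be sketched in code. This is only an illustration under stated assumptions: the record layout (`account`, `amount`, `proof`), the proof constant and the permitted range are hypothetical and do not come from the text.

```python
from dataclasses import dataclass

@dataclass
class Record:
    account: int   # identification code (hypothetical field)
    amount: int    # numeric field subject to the limit check
    proof: int     # proof figure carried with `amount`

PROOF_CONSTANT = 10_000    # hypothetical constant decided upon previously
AMOUNT_RANGE = (0, 5_000)  # hypothetical permitted range for `amount`

def edit_batch(records, record_count, control_total):
    """Return (record index or None, message) for every check that fails."""
    problems = []
    # Record count: detects omitted or duplicated records.
    if len(records) != record_count:
        problems.append((None, "record count mismatch"))
    # Control total: sum of one numeric field over the whole batch.
    if sum(r.amount for r in records) != control_total:
        problems.append((None, "control total mismatch"))
    for i, r in enumerate(records):
        # Proof figure: field plus proof figure must equal the constant.
        if r.amount + r.proof != PROOF_CONSTANT:
            problems.append((i, "proof figure mismatch"))
        # Limit check: the quantity must lie in its permitted range.
        if not AMOUNT_RANGE[0] <= r.amount <= AMOUNT_RANGE[1]:
            problems.append((i, "limit check failed"))
    # Sequence check: identification codes must be strictly increasing.
    for i in range(1, len(records)):
        if records[i].account <= records[i - 1].account:
            problems.append((i, "sequence check failed"))
    return problems
```

Each check is only a necessary condition: an empty result means no check failed, not that the batch is free of errors.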

2.3 Error detecting codes :-

The modulo 11 check digit scheme is the only error detecting code worth mentioning. But before describing it, the various errors possible will be mentioned. Let a number be

$$N = n_1 n_2 \ldots n_k$$

where each $n_i$ can be 0, 1, 2, ..., 9 (extension to alphanumeric codes is simple). The "channels" which can produce errors are reading and copying, reading and punching, etc. The errors are:

1. Single transcription error: $n_i$ becomes $n_i'$.
2. Single transposition error: $n_i$ and $n_j$ become $n_j$ and $n_i$.
3. Multiple identical adjacent digit transcription error: $n_i, n_{i+1}, \ldots$, which are all equal to $x$, become $y, y, \ldots, y$.
4. Zero shift errors: the number of adjacent zeroes gets changed.

In addition, there are several kinds of multiple errors.

Modulus $N$ check digits are of two types. In the first type, a check digit $n_{k+1}$ is added such that

$$S = \sum_{i=1}^{k+1} n_i w_i \equiv 0 \pmod{N}$$

where $w_i$ is a weight associated with the $i$th digit. In the second type, the weights are $1, 10, 10^2$, etc., so that the number itself, when divided by $N$, gives the check digit. Systems of the second type can be reduced to the first type, so that it is enough to consider only these.

The problem is to find $N$ and the weights so that most of the errors can be detected.


Single transcription errors :-

Here $n_i$ becomes $n_i'$. $S$, which should be zero, becomes $(n_i' - n_i) w_i$. The checking involves seeing if $S$ is zero. So this error can go undetected if

$$(n_i' - n_i) w_i \equiv 0 \pmod{N}.$$

This can be avoided if $n_i < N$, $w_i < N$ (with $w_i \neq 0$), and $N$ is prime.

Single transposition errors :-

Here $n_i$ and $n_j$ get interchanged. So $S$ becomes $(n_i - n_j)(w_j - w_i)$, which can go undetected if

$$(n_i - n_j)(w_j - w_i) = lN, \quad l = 0, 1, 2, 3, \ldots$$

and it can be avoided if $n_i < N$, no two weights are equal, and $N$ is prime.

Identical adjacent digit transcription errors :-

Here $n_i, n_{i+1}, \ldots, n_{i+d-1}$, which are all equal to $x$, become $y, y, \ldots, y$, so that $S$ becomes

$$(y - x) \sum_{p=i}^{i+d-1} w_p.$$

This can go undetected if

$$(y - x) \sum_{p=i}^{i+d-1} w_p = lN, \quad l = 1, 2, 3, \ldots$$

So to avoid this, it is enough if $n_i < N$ and $\sum_{p=i}^{i+d-1} w_p \neq lN$, $l = 1, 2, 3, \ldots$, for all $i$ and $d$, if $N$ is prime.

So these errors can be detected 100%. But for other errors, simple analysis is not possible. However, assuming equal probability, $(N-1)/N$ of them can be detected.
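The two conditions for single transcription and single transposition errors can be checked by brute force. The sketch below is illustrative only; the function names are hypothetical. It lists every single error that leaves $S$ unchanged modulo the chosen modulus, so an empty list means every such error is detected.

```python
def undetected_transcriptions(weights, modulus):
    """Single transcription errors (position, old, new) that leave S unchanged."""
    return [(i, old, new)
            for i, w in enumerate(weights)
            for old in range(10) for new in range(10)
            if old != new and ((new - old) * w) % modulus == 0]

def undetected_transpositions(weights, modulus):
    """Interchanges of unequal digits a, b at positions i < j that leave S unchanged."""
    return [(i, j, a, b)
            for i in range(len(weights)) for j in range(i + 1, len(weights))
            for a in range(10) for b in range(10)
            if a != b and ((a - b) * (weights[j] - weights[i])) % modulus == 0]

# Ten distinct non-zero weights below the prime modulus 11.
WEIGHTS = (1, 2, 5, 3, 6, 4, 8, 7, 10, 9)
```

With the prime modulus 11 both lists are empty for these weights, whereas a composite modulus such as 10 lets some errors through (for example a change of 5 in a digit of weight 2).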

D. F. Beckley in 1967 gave the following weights, with $N = 11$:

$$w_1 = 1,\ w_2 = 2,\ w_3 = 5,\ w_4 = 3,\ w_5 = 6,\ w_6 = 4,\ w_7 = 8,\ w_8 = 7,\ w_9 = 10,\ w_{10} = 9.$$

In 1969 Wild developed the theory as given above. In addition to $N = 11$, another suggestion was $N = 97$, which requires two check digits.


The modulus 11 check digit which enables

$$\sum_{i=1}^{k+1} n_i w_i \equiv 0 \pmod{11}$$

is

$$n_{k+1} \equiv -\,w_{k+1}^{-1} \sum_{i=1}^{k} n_i w_i \pmod{11},$$

where $w_{k+1}^{-1}$ is the inverse of $w_{k+1}$ modulo 11. Now it is possible that $n_{k+1} = 10$, which cannot be accommodated in one decimal digit.

One way to overcome this is to discard all numbers which give a check digit of 10. This can be done for most of the systems in the design phase, but not for systems already designed. Another way is to use some special symbol like A for 10. But this creates problems in editing. D. V. A. Campbell [11] came up with a solution to this in 1970.

He observed that Beckley's weight sequence is only one of the possible ones. He gave another sequence $W'$, with $w_1' = 1, \ldots, w_{10}' = 10$. According to him, if a number gave a check digit of 10 with the weights $W$, it would not with the weights $W'$. So whenever a number gave a check digit of 10, the $W'$ system was to be used. In the checking algorithm, if $S \neq 0$, then $S$ was to be recalculated using $W'$. R. W. Broderick and C. I. Reid [26] pointed out some defects in this system. For instance, 1003 and 180000000 give a check digit of 10 in both the systems. Reid proved that it is sufficient to have $w_i'$ related to $w_i$ by a congruence (modulo 11) for all $i$ except the weight applicable to the check digit and its arithmetic inverse modulo 11, for which $w_i' = w_i$.


The next important paper was by A. M. Andrew in the early 1970s, who gave a variant of the modulus 11 scheme, in which a check digit of 10 cannot occur. Instead of evaluating $\sum_i n_i w_i$ modulo 11, he proposed a modified sum. But this cannot detect all the errors the original system can.

T. Briggs, also in the early 1970s, analysed all the work done so far, and brought out some practical aspects. He found a tree-diagram method of generating the Wild weight sequence, and modified Campbell's method, removing Reid's objection.

Extension of the modulus 11 scheme to alphanumeric data is straightforward. It is also possible to use the character code of any computer instead of assigning sequential numbers to the characters.
2.4 The need for error correcting :-

The check digit system described above only detects most of the errors in input data. Corrections have to be made manually. To that extent it is just another edit procedure. This is time consuming, and if the data is very voluminous, it could involve going several times "around the loop". This does not enable maintaining an up-to-date file, and consequently results in bad management information. If the error rate is high, the effect could be disastrous and the credibility of the computer may be lost.

What is needed is a simple way of automatically correcting most, if not all, of the errors. This would make the system very efficient. But methods similar to those in coding theory cannot be used, since they would involve a tremendous overhead of check digits, and consequently will not be accepted. A little extra computer time can, however, be tolerated.

The next chapter proposes one error correcting code meeting the above requirements.

APPENDIX

A simple manual method of calculating check digits

For large amounts of data generated manually for which it is feasible to add check digits, it is cumbersome and error-prone to calculate them for each number using the formula

$$\text{Check digit} \equiv -\,w_{k+1}^{-1} \sum_{i=1}^{k} n_i w_i \pmod{N}.$$

A practical method is presented here for $N = 11$. It holds for any $N$. It uses the result

$$\Big(\sum_i n_i w_i\Big) \bmod N = \Big(\sum_i (n_i w_i \bmod N)\Big) \bmod N.$$

It consists of two tables. Table 1 gives the values of $n_i w_i \bmod 11$ for any $i$. They are to be added manually. The sum, when referred to Table 2, gives the check digit.

Beckley's weights have been chosen here:

w_1  w_2  w_3  w_4  w_5  w_6  w_7  w_8  w_9  w_10
 1    2    5    3    6    4    8    7   10    9

Let $n_1 n_2 \ldots n_k$ be the number, and it is required to find the check digit $n_{k+1}$ such that

$$\sum_{i=1}^{k+1} n_i w_i \equiv 0 \pmod{11}$$

with the weights as given.

Algorithm :-

1. Look up Table 1. Under the digit 1 column, look up the number corresponding to $n_1$. Similarly look up the number for the other digits and add them up to give the sum.

2. Look up Table 2, locate the sum and read the corresponding check digit.
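The two-table procedure can be mimicked as follows. This is a sketch under assumptions: the tables themselves are not reproduced in the text, so they are regenerated here, and the check digit is assumed (hypothetically) to take the weight $w_{k+1}$ that follows the weights of the $k$ information digits.

```python
N = 11
BECKLEY = (1, 2, 5, 3, 6, 4, 8, 7, 10, 9)

def build_table1(weights=BECKLEY):
    """table1[i][d] = (d * w_i) mod 11 for position i (0-based) and digit d."""
    return [[(d * w) % N for d in range(10)] for w in weights]

def build_table2(check_weight, max_sum):
    """table2[s] = check digit c with s + c * check_weight = 0 modulo 11."""
    inverse = pow(check_weight, -1, N)  # modular inverse, Python 3.8+
    return [(-s * inverse) % N for s in range(max_sum + 1)]

def check_digit_by_tables(info_digits, weights=BECKLEY):
    """Step 1: add the Table 1 entries. Step 2: refer the sum to Table 2."""
    k = len(info_digits)
    if k >= len(weights):
        raise ValueError("not enough weights for this many digits")
    table1 = build_table1(weights[:k])
    table2 = build_table2(weights[k], max_sum=10 * k)  # each entry is at most 10
    total = sum(table1[i][d] for i, d in enumerate(info_digits))
    return table2[total]
```

Only small additions are done by hand in this scheme, since every Table 1 entry is already reduced modulo 11.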

If it is required to check manually whether a number is correct, the same procedure can be applied, with the difference that instead of going to Table 2, it is enough to check if the sum is 0, 11, 22, 33, etc.

A PROPOSED ERROR CORRECTING CODE

3.1 Overview :-

The need for automatic error correction was established in the previous chapter. G. M. Weinberg extended Hamming's single error correcting code to decimal numbers. It was a straightforward extension using modulo 10 arithmetic instead of the modulo 2 arithmetic of Hamming. It could correct all single errors, but it required $k$ check digits for $m$ information digits, such that $2^k \geq m + k + 1$. For example, for 4 information digits, 3 check digits were required. The method was not free of pitfalls. After this, no further work was done in this area.

The proposed error correcting code requires only two extra check digits irrespective of the number of information digits, and it corrects all single errors (i.e., single transcription errors) and all single transposition errors, which together account for about 80% of all the errors. First the method is presented. Then the proofs are given and the theory underlying the method is developed. Finally, the algorithm is discussed.


3.2 The method :-

Let the number to be coded be $N = n_1 n_2 \ldots n_k$, where $n_i \in \{0, 1, 2, \ldots, 9\}$ for $i = 1, 2, \ldots, k$.

Procedure for coding :-

Let a weight sequence

$$W = (w_1, w_2, \ldots, w_{k+2})$$

be, for $k = 8$,

4 6 10 9 3 7 8 5 2 1

and another sequence

$$W' = (w_1', w_2', \ldots, w_{k+2}')$$

be

5 10 4 8 9 6 1 7 3 2

Two check digits $n_{k+1}$ and $n_{k+2}$ are to be appended to $N$, given by

$$n_{k+1} \equiv \sum_{i=1}^{k} n_i w_i' - 2 \sum_{i=1}^{k} n_i w_i \pmod{11}$$

$$n_{k+2} \equiv 3 \sum_{i=1}^{k} n_i w_i - 2 \sum_{i=1}^{k} n_i w_i' \pmod{11}$$

The coded number is $n_1 n_2 \ldots n_k n_{k+1} n_{k+2}$.

Procedure for error detecting :-

Check whether

$$S_1 = \sum_{i=1}^{k+2} n_i w_i \equiv 0 \pmod{11} \quad \text{and} \quad S_2 = \sum_{i=1}^{k+2} n_i w_i' \equiv 0 \pmod{11}.$$

If yes, the number is error free. Otherwise, call the procedure for error correcting.

Procedure for error correcting :-

1. Compute $S_1$ and $S_2$.

2. Consider each digit from 1 to $k+2$. For each, insert all the other 9 possible digits. At every insertion, compute $S_1$. If $S_1 = 0$ at some insertion, compute $S_2$. If $S_2 = 0$, the error was a single transcription error which has been corrected. Exit. Else continue, replacing the original digit in its place after all the other 9 digits have been inserted unsuccessfully.

3. Exchange the first and second digits. Compute $S_1$. If $S_1 = 0$, compute $S_2$. If $S_2 = 0$, it was a single transposition error which has been corrected. Exit. Otherwise put back the digits. Next exchange the second and third digits, and so on.

4. Repeat (3), but now exchange alternate digits.

5. Report "an error which cannot be corrected". Exit.
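The coding and correcting procedures can be sketched as below. The weights `W1` and `W2` are hypothetical, chosen for $k = 4$ so that the ratios $w_i'/w_i$ are all distinct modulo 11 and the last two columns have determinant 1; they are not the sequences printed above. For brevity the sketch computes $S_1$ and $S_2$ together at each trial, instead of computing $S_2$ only when $S_1 = 0$.

```python
MOD = 11
# Hypothetical weights for k = 4 information digits plus 2 check digits.
W1 = (4, 6, 10, 9, 2, 1)
W2 = (5, 10, 8, 6, 3, 2)

def sums(digits):
    """Return (S1, S2) for a coded number of k + 2 digits."""
    s1 = sum(d * w for d, w in zip(digits, W1)) % MOD
    s2 = sum(d * w for d, w in zip(digits, W2)) % MOD
    return s1, s2

def encode(info):
    """Append the two check digits to the k information digits."""
    if len(info) != len(W1) - 2:
        raise ValueError("expected k = 4 information digits")
    a = sum(d * w for d, w in zip(info, W1)) % MOD
    b = sum(d * w for d, w in zip(info, W2)) % MOD
    # Solve 2x + y = -a and 3x + 2y = -b (mod 11); the determinant is 1.
    x = (b - 2 * a) % MOD
    y = (3 * a - 2 * b) % MOD
    if x > 9 or y > 9:
        raise ValueError("a check digit of 10 arises for this number")
    return list(info) + [x, y]

def correct(received):
    """Follow steps 1 to 5: substitutions first, then exchanges."""
    digits = list(received)
    if sums(digits) == (0, 0):
        return digits
    # Step 2: try the other 9 digits in every position.
    for i, original in enumerate(digits):
        for candidate in range(10):
            if candidate != original:
                digits[i] = candidate
                if sums(digits) == (0, 0):
                    return digits
        digits[i] = original
    # Steps 3 and 4: exchange adjacent digits, then alternate digits.
    for gap in (1, 2):
        for i in range(len(digits) - gap):
            j = i + gap
            digits[i], digits[j] = digits[j], digits[i]
            if digits[i] != digits[j] and sums(digits) == (0, 0):
                return digits
            digits[i], digits[j] = digits[j], digits[i]
    # Step 5.
    raise ValueError("an error which cannot be corrected")
```

With these assumed weights, `encode([3, 1, 4, 1])` gives 3 1 4 1 6 9. Because step 2 runs before step 3, a transposition is repaired as such only when no single-digit substitution happens to clear both sums; the sketch does not check that cross-condition.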
3.3 Proofs :-

The weight sequences $W$ and $W'$ are chosen so as to individually satisfy the requirements of the modulus 11 error detecting scheme explained in the previous chapter. So they detect the same errors. Additional constraints will now be placed on $W$ and $W'$ to effect correction.

Single transcription errors :-

Now

$$S_1 = \sum_{i=1}^{k+2} n_i w_i, \qquad S_2 = \sum_{i=1}^{k+2} n_i w_i'.$$

If there is no error, $S_1 = 0$ and $S_2 = 0$. Due to a single transcription error, if the digit $n_i$ becomes $n_i'$, then

$$S_1 = (n_i' - n_i) w_i, \qquad S_2 = (n_i' - n_i) w_i'.$$

It is enough to prove that there is a unique way of correcting this error, that is, a unique way of making both $S_1$ and $S_2$ zero. Obviously there is one way, namely by changing $n_i'$ back to $n_i$. To prove that there is no other way, consider any other change, say $n_j$ to $n_j'$. This results in the addition of

$$S_1' = (n_j' - n_j) w_j, \qquad S_2' = (n_j' - n_j) w_j'$$

to $S_1$ and $S_2$. If this should not make $S_1$ and $S_2$ both zero, then it is sufficient if

$$S_1 + S_1' \equiv 0 \quad \text{and} \quad S_2 + S_2' \equiv 0 \pmod{11}$$

do not hold simultaneously. That is,

$$(n_j' - n_j) w_j \equiv -(n_i' - n_i) w_i$$

and

$$(n_j' - n_j) w_j' \equiv -(n_i' - n_i) w_i'$$

should not hold simultaneously. If one of them holds, the other should not. So dividing one by the other,

$$\frac{w_j'}{w_j} \equiv \frac{w_i'}{w_i} \pmod{11}, \quad i, j = 1, 2, \ldots, k+2, \quad i \neq j$$

should not hold. This is therefore a sufficient condition for the unique correctability of all single transcription errors.

Single transposition errors :-

If due to this error, two digits $n_i$ and $n_j$ get interchanged, then

$$S_1 = (n_i - n_j)(w_j - w_i), \qquad S_2 = (n_i - n_j)(w_j' - w_i').$$

One way of correcting this is to interchange again $n_i$ and $n_j$. To show that there is no other way to make $S_1$ and $S_2$ both zero, consider any other interchange, say of digits $n_p$ and $n_q$. This adds

$$S_1' = (n_p - n_q)(w_q - w_p), \qquad S_2' = (n_p - n_q)(w_q' - w_p')$$

to $S_1$ and $S_2$. For unique correctability, as before, the condition is that $S_1' \equiv -S_1$ and $S_2' \equiv -S_2$ do not hold together. That is,

$$(n_i - n_j)(w_j - w_i) \equiv -(n_p - n_q)(w_q - w_p)$$

and

$$(n_i - n_j)(w_j' - w_i') \equiv -(n_p - n_q)(w_q' - w_p')$$

should not hold simultaneously. Hence, as before, a sufficient condition is

$$\frac{w_j' - w_i'}{w_j - w_i} \not\equiv \frac{w_q' - w_p'}{w_q - w_p} \pmod{11}, \quad i, j, p, q = 1, 2, \ldots, k+2,$$

with $i \neq j$ and $p \neq q$.

But usually single transposition errors are for adjacent and alternate digits. Hence it is sufficient if

$$\frac{w_{i+1}' - w_i'}{w_{i+1} - w_i} \not\equiv \frac{w_{j+1}' - w_j'}{w_{j+1} - w_j}, \qquad
\frac{w_{i+1}' - w_i'}{w_{i+1} - w_i} \not\equiv \frac{w_{j+2}' - w_j'}{w_{j+2} - w_j}, \qquad
\frac{w_{i+2}' - w_i'}{w_{i+2} - w_i} \not\equiv \frac{w_{j+2}' - w_j'}{w_{j+2} - w_j} \pmod{11},$$

for $i, j = 1, 2, \ldots, k+2$ and $i \neq j$.

It remains to be shown that weight sequences $W$ and $W'$ exist as required above. Instead of proving, $W$ and $W'$ have been found by trial and error. That they satisfy the required conditions can be seen from the following table.

Calculation of check digits :-

The check digits $n_{k+1}$, $n_{k+2}$ have to be found satisfying

$$\sum_{i=1}^{k+2} n_i w_i \equiv 0 \ (11), \qquad \sum_{i=1}^{k+2} n_i w_i' \equiv 0 \ (11).$$

The existence of a unique solution is to be proved and the solution is to be found. This is possible by using number theory, especially the work of Euler on Diophantine equations.

The following theorems and definitions are necessary:

Theorem 1 :- Modulo any integer $m$,

$$a \equiv b \iff b \equiv a \iff b - a \equiv 0 \iff a - b \equiv 0.$$

Definition 1 :- If $x \equiv b \ (m)$, then $b$ is a residue of $x$ modulo $m$. If $0 \leq b < m$, then $b$ is a least positive residue.

Definition 2 :- A set of integers is a complete set of residues modulo $m$ if no two of them are congruent, and every integer is congruent to one of them.

The set $\{0, 1, \ldots, m-1\}$ is a complete set of least positive residues modulo $m$.

Theorem 2 :- Modulo any integer $m$,

1) $a \equiv b,\ b \equiv c \implies a \equiv c$

2) $a \equiv b,\ c \equiv d \implies a + c \equiv b + d$, $\ ar + cs \equiv br + ds$, $\ ac \equiv bd$

Definition 3 :- $(m, c)$ is the gcd of $m$ and $c$.

Theorem 3 :-

$$ca \equiv cb \ (m) \implies a \equiv b \ (m/(m, c))$$

Corollary 1 :-

$$ca \equiv cb \ (m),\ (c, m) = 1 \implies a \equiv b \ (m)$$

Definition 4 :- A residue class $A = \{x : x \equiv r \ (m)\}$ is a prime residue class if $(r, m) = 1$.

Definition 5 :- A complete set of prime residues is a set $S$ such that

1) $r, s \in S,\ r \neq s \implies r \not\equiv s \ (m)$

2) $r \in S \implies (r, m) = 1$

3) $(a, m) = 1 \implies \exists\, r \in S$ such that $a \equiv r \ (m)$

If in addition $0 < r < m$ for all $r \in S$, $S$ is a reduced set of least positive residues.

Definition 6 :- Euler's $\varphi$-function $\varphi(n)$ is the number of positive integers not exceeding $n$ which are also coprime to $n$, i.e., the number of $r$ with $0 < r \leq n$ and $(r, n) = 1$.

E.g., $\varphi(1) = 1$, $\varphi(4) = 2$, $\varphi(11) = 10$.

Theorem 4 :- (Euler)

$$(a, m) = 1 \implies a^{\varphi(m)} \equiv 1 \ (m)$$

Proof :-

Let $r_1, r_2, \ldots, r_k$ be the reduced residues modulo $m$. Then $a r_1, a r_2, \ldots, a r_k$ is also a reduced set of residues if $(a, m) = 1$. Hence each of $r_1, \ldots, r_k$ is congruent to some one of $a r_1, \ldots, a r_k$. So multiplying,

$$r_1 r_2 \cdots r_k \equiv a^k\, r_1 r_2 \cdots r_k \ (m).$$

But $(r_1 r_2 \cdots r_k, m) = 1$ since $(r_i, m) = 1$ for all $i$. So cancelling,

$$1 \equiv a^k \ (m).$$

But $k = \varphi(m)$ by definition. Hence

$$a^{\varphi(m)} \equiv 1 \ (m).$$

Q.E.D.

Theorem 5 :- If $(a, m) = 1$, then $ax \equiv b \ (m)$ has a unique solution modulo $m$.

Proof :- By Theorem 4,

$$(a, m) = 1 \implies a^{\varphi(m)} \equiv 1 \ (m).$$

Hence $a \cdot a^{\varphi(m)-1} \equiv 1 \ (m)$.

So $y = a^{\varphi(m)-1}$ is a solution of $ay \equiv 1 \ (m)$. Multiplying both sides by $b$, and letting $x = by$,

$$a \cdot by \equiv b \ (m),$$

so that $x = by = b\,a^{\varphi(m)-1}$ is a solution of $ax \equiv b \ (m)$.

To show that this solution is unique, let $x_1$ and $x_2$ be two solutions. Then

$$a x_1 \equiv b,\ a x_2 \equiv b \implies a(x_1 - x_2) \equiv 0 \ (m).$$

Now since $(a, m) = 1$, $m \mid (x_1 - x_2)$. Hence $x_1 \equiv x_2 \ (m)$.

So the solution is unique modulo $m$.

Q.E.D.
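The constructive part of Theorem 5 translates directly into code. The sketch below is illustrative (the function names are hypothetical); it counts $\varphi(m)$ naively from the definition and returns $x = b\,a^{\varphi(m)-1} \bmod m$.

```python
from math import gcd

def phi(m):
    """Euler's function: positive integers not exceeding m and coprime to m."""
    return sum(1 for r in range(1, m + 1) if gcd(r, m) == 1)

def solve_congruence(a, b, m):
    """Unique solution modulo m of a*x = b (mod m), assuming gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("Theorem 5 needs (a, m) = 1")
    # Three-argument pow performs the modular exponentiation a^(phi(m)-1) mod m.
    return (b * pow(a, phi(m) - 1, m)) % m
```

For example, $3x \equiv 4 \ (11)$ gives $x = 4 \cdot 3^9 \bmod 11 = 5$, and indeed $3 \cdot 5 = 15 \equiv 4 \ (11)$.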

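Consider now

$$a_1 x + b_1 y \equiv c_1 \ (m)$$

$$a_2 x + b_2 y \equiv c_2 \ (m)$$

Theorems 1 and 2 enable Cramer's rule to be applied. So, if

$$D = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}, \qquad
D_1 = \begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}, \qquad
D_2 = \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix},$$

we have

$$Dx \equiv D_1 \ (m)$$

$$Dy \equiv D_2 \ (m)$$

If $(D, m) = 1$, then Theorem 5 is applicable, and

$$x \equiv D_1 D^{\varphi(m)-1} \ (m)$$

$$y \equiv D_2 D^{\varphi(m)-1} \ (m)$$

With the above background,

$$\sum_{i=1}^{k+2} n_i w_i \equiv 0, \qquad \sum_{i=1}^{k+2} n_i w_i' \equiv 0 \pmod{11}$$

can be solved to get $n_{k+1}$ and $n_{k+2}$.

Rearranging,

$$n_{k+1} w_{k+1} + n_{k+2} w_{k+2} \equiv -\sum_{i=1}^{k} n_i w_i$$

$$n_{k+1} w_{k+1}' + n_{k+2} w_{k+2}' \equiv -\sum_{i=1}^{k} n_i w_i'$$

Here

$$D = \begin{vmatrix} w_{k+1} & w_{k+2} \\ w_{k+1}' & w_{k+2}' \end{vmatrix}, \qquad
D_1 = \begin{vmatrix} -\sum_{i=1}^{k} n_i w_i & w_{k+2} \\ -\sum_{i=1}^{k} n_i w_i' & w_{k+2}' \end{vmatrix}, \qquad
D_2 = \begin{vmatrix} w_{k+1} & -\sum_{i=1}^{k} n_i w_i \\ w_{k+1}' & -\sum_{i=1}^{k} n_i w_i' \end{vmatrix}.$$

So from

$$D\, n_{k+1} \equiv D_1, \qquad D\, n_{k+2} \equiv D_2 \pmod{11},$$

by Theorem 5,

$$n_{k+1} \equiv D_1 D^{\varphi(11)-1}, \qquad n_{k+2} \equiv D_2 D^{\varphi(11)-1} \pmod{11}.$$

A trick can be employed here to make $D = 1$, so that we have

$$n_{k+1} \equiv D_1, \qquad n_{k+2} \equiv D_2 \pmod{11}.$$

For this it is only necessary to set $w_{k+1} = 2$, $w_{k+2} = 1$, $w_{k+1}' = 3$ and $w_{k+2}' = 2$, so that

$$D = \begin{vmatrix} 2 & 1 \\ 3 & 2 \end{vmatrix} = 1.$$

So, finally,

$$n_{k+1} \equiv \sum_{i=1}^{k} n_i w_i' - 2 \sum_{i=1}^{k} n_i w_i, \qquad
n_{k+2} \equiv 3 \sum_{i=1}^{k} n_i w_i - 2 \sum_{i=1}^{k} n_i w_i' \pmod{11}.$$

This seems to be the easiest method to find the check digits. If the summations are to be found manually, the technique given in the last chapter can be used.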

3.4 Discussion of the algorithm :-

This algorithm is completely different from all the error correcting codes. It does not generate a syndrome, which all the other algorithms do. It requires two and only two check digits irrespective of the number of information digits, whereas in all the other codes, the number of check digits increases with the number of information digits. It corrects two completely different types of errors, whereas most of the codes can correct only one type of error.

The reason why just two check digits are sufficient is that no syndrome is generated. Hint is taken from the fact that there is only one way of correcting any given error. The check digits are used as indicators to point out whether any alteration made is the correction required or not.

Of course it is possible to generate a syndrome to correct single errors (as G. M. Weinberg has done), and perhaps multiple errors also, but as experience has shown, they do not become popular in information systems due to the large overhead of check digits. Another reason is that such conventional codes cannot correct exchange errors. The proposed code corrects both these errors, which constitute perhaps 80% of all the errors, with only two check digits. The price to be paid for this is extra processing time on the computer. The cost of this is probably commensurate with the benefits. Also, since this error correction is to be done while reading, if we have a uniprogrammed system or a multiprogrammed system with an imperfect program mix, then there will always be a lot of idle CPU time between the reading of two records, and the extra time required for error correcting will be completely transparent to the user.

A COMPARISON WITH THE EXISTING DATA ENTRY METHODS

4.1 Overview :-

In the last chapter, an error correcting method was developed. It remains to be seen how good a data entry method using the error correcting feature will be, compared to the existing methods. There are many factors which vary from method to method. So if a quantitative index of performance is to be found, then weightages or costs must be assigned to the factors. Next, to quantify each factor, a model is needed for each of the methods. The model should be quantitative and simple, and it should bring out all the essential differences in the methods.

In this chapter the essential factors are enumerated and the index of performance is defined. Then one model is developed for each of the methods, and expressions for the index are derived. Finally the results are presented.

4.2 Formulation of a model :-

The following methods are considered:

1. Keypunch - error correcting
2. Keypunch - error detecting
3. Keypunch - verifying
4. Key-to-tape (and key-to-disk)

The main assumptions are:

1. The overall error rate is independent of the number of records, or any other factor.
2. Of the errors, 80% are single transcription and single transposition errors.

The essential parameters are:

$N$, the number of records
$e$, the error rate
$L_i$, the total number of characters in the important fields, for which strict error controls are justified
$L_u$, the total number of characters in unimportant fields
$c$, the number of important fields
$L_c$, the number of characters in the error control subfield, added to every important field. It can have values

0 - No error control
1 - Error detecting
2 - Error correcting

The length of a record is increased from $L_i + L_u$ to $L_i + L_u + cL_c$ due to the error control fields, and the number of characters to be entered, from $N(L_i + L_u)$ to $N(L_i + L_u + cL_c)$.

The factors which should be derivable from the above parameters are:

$n$, the number of "feedbacks", i.e., the number of times one has to go through the data entry process in order to enter a batch of $N$ records
$s$, the extra keystrokes required
$m$, the extra manual handling in terms of characters
$t$, the extra time taken on the computer for error detecting and/or correcting

With these four factors, four costs $c_1$, $c_2$, $c_3$ and $c_4$ are associated, so that the expression for total cost is

$$C(i) = n c_1 + s c_2 + m c_3 + t c_4$$

where $i = 1, 2, 3, 4$ for the four methods.

1. Keypunch - error correcting :-

[Figure: block diagram of the method. P: Keypunch; EDC: Error detecting and correction; DB: Data base; D: Delay]

$N$ records are punched and fed to the computer. $Ne$ of them are erroneous. Of them 80%, i.e., $4Ne/5$, are corrected, and $Ne/5$ are sent back to be punched through the delay D. The same rule applies to the $Ne/5$ records, and so on, until one record is left. So the total number of records punched are

$$N' = N + \frac{Ne}{5} + \frac{Ne^2}{5^2} + \cdots + \frac{Ne^n}{5^n}$$

where

$$\frac{Ne^n}{5^n} < 1 \quad \text{and} \quad \frac{Ne^{n-1}}{5^{n-1}} \geq 1.$$

Solving, we get

$$n = P\left(-\log N / \log(e/5)\right),$$

that is, $n$ is the smallest integer satisfying the first inequality. Hence

$$N' = N\,\frac{1 - (e/5)^{n+1}}{1 - (e/5)}.$$

The number of feedbacks is $n$. Each record actually contains $L_i + L_u + 2c$ characters, since two characters are to be added for each of the $c$ important fields, that is, $L_c = 2$.

The number of keystrokes required is

$$N(L_i + L_u + 2c).$$

But the actual number of keystrokes is

$$N(L_i + L_u + 2c)\,\frac{1 - (e/5)^{n+1}}{1 - e/5}.$$

So the extra keystrokes are

$$s = N(L_i + L_u + 2c)\left(\frac{1 - (e/5)^{n+1}}{1 - e/5} - 1\right).$$

Since $L_c = 2$, the extra manual handling is $m = 2Nc$.

The extra time taken on the computer is the time for the detecting algorithm to operate on all the records and the correcting algorithm on all the erroneous records. The detection algorithm operates on

$$N' = N\,\frac{1 - (e/5)^{n+1}}{1 - e/5}$$

records, and the correction algorithm on

$$Ne + \frac{Ne^2}{5} + \cdots + \frac{Ne^{n+1}}{5^n} = Ne\,\frac{1 - (e/5)^{n+1}}{1 - e/5}$$

records. So the extra time taken on the computer is

$$t = N\,\frac{1 - (e/5)^{n+1}}{1 - e/5}\,(\text{time to detect error in } c \text{ fields}) + Ne\,\frac{1 - (e/5)^{n+1}}{1 - e/5}\,(\text{average time to correct one error}).$$

2. Keypunch - error detecting :-

[Figure: block diagram of the method. P: Keypunch; ED: Error detecting; DB: Data base; D: Delay]

Here the actual length of a record is $L_i + L_u + c$, since $L_c = 1$. Since the error rate is $e$, and no correction is involved, the total number of records actually punched is

$$N' = N + Ne + Ne^2 + \cdots + Ne^n$$

where

$$Ne^n < 1 \quad \text{and} \quad Ne^{n-1} \geq 1,$$

so that the number of feedbacks is

$$n = P(-\log N / \log e).$$

Hence,

$$N' = N\,\frac{1 - e^{n+1}}{1 - e}.$$

The number of extra keystrokes is

$$s = N(L_i + L_u + c)\left(\frac{1 - e^{n+1}}{1 - e} - 1\right)$$

and the extra manual handling is

$$m = Nc$$

characters. The extra computer time is the time to detect errors in the $N'$ records actually fed, so that

$$t = N\,\frac{1 - e^{n+1}}{1 - e}\,(\text{time to detect errors in } c \text{ fields}).$$


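3. Keypunch - verifying :-

[Figure: block diagram of the method. P: Keypunch; V: Verifier; DB: Data base; D: Delay]

Exactly as in keypunch - error detecting,

$$N' = N\,\frac{1 - e^{n+1}}{1 - e}$$

where

$$n = P(-\log N / \log e).$$

The number of extra keystrokes is

$$s = N(L_i + L_u)\left(2\,\frac{1 - e^{n+1}}{1 - e} - 1\right)$$

since $L_c = 0$ and every record punched is keyed a second time at the verifier.

The extra manual handling and extra computer time are each zero.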


4. Key-to-tape :-

[Figure: block diagram of the method. K: Keyboard; C: Comparator (a program); DB: Data base; D: Delay]

It is assumed that the error rate $e$ is the same for both the operators, and that the errors do not overlap. Also the difference in speeds of the two operators is not considered, since there is usually a buffer to take care of that.

Each operator punches

$$N + 2Ne + 2Ne^2 + \cdots + 2Ne^n$$

records, so the total number of records punched are

$$N' = 2N + 4Ne + 4Ne^2 + \cdots + 4Ne^n$$

where

$$2Ne^n < 1 \quad \text{and} \quad 2Ne^{n-1} \geq 1.$$

The number $n$ is

$$n = P(-\log 2N / \log e).$$

The total number of keystrokes are

$$N(L_i + L_u)(2 + 4e + 4e^2 + \cdots + 4e^n) = N(L_i + L_u)\left(4\,\frac{1 - e^{n+1}}{1 - e} - 2\right).$$

Since $N(L_i + L_u)$ strokes were actually required, the extra keystrokes are

$$s = N(L_i + L_u)\left(4\,\frac{1 - e^{n+1}}{1 - e} - 3\right).$$

The extra manual handling is $m = 0$. The extra computer time is the time to compare

$$N + 2Ne + \cdots + 2Ne^n$$

pairs of records. So

$$t = N\left(2\,\frac{1 - e^{n+1}}{1 - e} - 1\right)(\text{time to compare two records}).$$

Now the expressions for cost

$$C(i) = n c_1 + s c_2 + m c_3 + t c_4$$

can be written down as

$$C(1) = n c_1 + N(L_i + L_u + 2c)\left(\frac{1 - (e/5)^{n+1}}{1 - e/5} - 1\right) c_2 + 2Nc\, c_3 + \left(N\,\frac{1 - (e/5)^{n+1}}{1 - e/5}\,(\text{time to detect error in } c \text{ fields}) + Ne\,\frac{1 - (e/5)^{n+1}}{1 - e/5}\,(\text{average time to correct one error})\right) c_4$$

$$C(2) = n c_1 + N(L_i + L_u + c)\left(\frac{1 - e^{n+1}}{1 - e} - 1\right) c_2 + Nc\, c_3 + N\,\frac{1 - e^{n+1}}{1 - e}\,(\text{time to detect errors in } c \text{ fields})\, c_4$$

$$C(3) = n c_1 + N(L_i + L_u)\left(2\,\frac{1 - e^{n+1}}{1 - e} - 1\right) c_2$$

$$C(4) = n c_1 + N(L_i + L_u)\left(4\,\frac{1 - e^{n+1}}{1 - e} - 3\right) c_2 + N\left(2\,\frac{1 - e^{n+1}}{1 - e} - 1\right)(\text{time to compare 2 records})\, c_4$$

For 1,

$$n = P(-\log N / \log(e/5))$$

For 2 and 3,

$$n = P(-\log N / \log e)$$

For 4,

$$n = P(-\log 2N / \log e)$$

Now in most of the applications, the extra manual handling does not cost much, and the extra computer time is not important since it is very small. So it may be assumed for simplicity that $c_3 = c_4 = 0$. To effect another simplification, let the percentage of extra keystrokes be considered, instead of the actual number of extra keystrokes. Also, let $c_1$ and $c_2$ be chosen so that one extra feedback costs 100 times more than one percent extra keystrokes. In fact the ratio $c_1/c_2$ could be much more. Then

$$C(1) = n + \left(1 + \frac{2c}{L_i + L_u}\right)\left(\frac{1 - (e/5)^{n+1}}{1 - e/5} - 1\right)$$

$$C(2) = n + \left(1 + \frac{c}{L_i + L_u}\right)\left(\frac{1 - e^{n+1}}{1 - e} - 1\right)$$

$$C(3) = n + 2\,\frac{1 - e^{n+1}}{1 - e} - 1$$

$$C(4) = n + 4\,\frac{1 - e^{n+1}}{1 - e} - 3$$

with the $n$ in each case unchanged.

Example :-

Let

No. of records, $N$ = 10,000
Error rate, $e$ = 1%
Length of important fields, $L_i$ = 20
Length of unimportant fields, $L_u$ = 80
No. of important fields, $c$ = 2

Then applying the above formulas, roughly

C(1) = 2 units
C(2) = 3 units
C(3) = 4 units
C(4) = 4 units
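
The example figures can be checked with a short script. This is only a sketch of the simplified cost expressions, not the author's program; the function names are made up for the illustration. It assumes that the number of feedback rounds n is the smallest integer for which the expected number of erroneous records left, N e^n (or 2N e^n for method 4), falls strictly below one. For N = 10,000 and e = 1% the ratio -log N / log e is exactly 2, and the quoted figures correspond to taking n = 3 at that boundary. Exact fractions are used so that the boundary case is not decided by floating-point rounding.

```python
from fractions import Fraction


def rounds(expected_records, err):
    """Smallest n such that expected_records * err**n < 1 (assumed boundary rule)."""
    n, remaining = 0, Fraction(expected_records)
    while remaining >= 1:
        remaining *= err
        n += 1
    return n


def geometric(r, n):
    """1 + r + ... + r**n, written as (1 - r**(n+1)) / (1 - r)."""
    return (1 - r ** (n + 1)) / (1 - r)


def simplified_costs(N, e, L_i, L_u, c):
    """Return [C(1), C(2), C(3), C(4)] from the simplified cost expressions."""
    e = Fraction(str(e))            # exact value, e.g. 0.01 -> 1/100
    L = L_i + L_u
    n1 = rounds(N, e / 5)           # method 1: effective error rate e/5
    n23 = rounds(N, e)              # methods 2 and 3
    n4 = rounds(2 * N, e)           # method 4
    c1 = n1 + (1 + Fraction(2 * c, L)) * (geometric(e / 5, n1) - 1)
    c2 = n23 + (1 + Fraction(c, L)) * (geometric(e, n23) - 1)
    c3 = n23 + 2 * geometric(e, n23) - 1
    c4 = n4 + 4 * geometric(e, n4) - 3
    return [float(x) for x in (c1, c2, c3, c4)]


costs = simplified_costs(N=10_000, e=0.01, L_i=20, L_u=80, c=2)
print([round(x, 2) for x in costs])
```

Rounded to whole units, the four values come out as 2, 3, 4 and 4, in agreement with the example.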

4.3 Results :-

The graphs appended show the results. $L$ has been
taken to be 100 throughout. $c$ is taken to be 1 and 2, and for each,
$e$ is taken to be 1%, ..., 10%, and graphs are plotted of cost vs.
$N$, which varies from 1000 to 10,000 in steps of 1000.

The conclusions from the graphs are:

1. The method with the error-correcting code costs less than each of
the others.

2. As the error rate goes up, the superiority of the proposed method
becomes more marked.

So it can be concluded that, with reasonable assumptions,
incorporation of the proposed error correcting code in a data entry
method will decrease the overall cost.


CHAPTER 5


SECRECY TRANSFORMATIONS


5.1 Overview :-

This chapter deals with the problem of secrecy transformations
in information systems. Many files, especially in integrated
systems, contain sensitive data which must be protected. The problem
becomes more severe in shared data bases. In many applications, secret
data has to be handled manually, and it becomes necessary to code the
data.

The question of restricting access to files has received
some attention, but no simple and efficient means have been developed
to code data in order to render it unintelligible. This chapter lists
the requirements of secrecy transformations and presents several simple
algorithms for the purpose. Finally, it gives simple quantitative
methods of comparing the transformations in view of the requirements.

The concepts developed were applied to a live system,
described in Chapter 6.

5.2 Some proposed transformations :-

The main requirements can be stated as below:

1) The code should be invertible.

2) The coded number should bear as little relation to the original
number as possible.

3) The algorithm should be easily implementable.

4) The transformation should preserve all the error detecting and
correcting properties of the original number.

5) The code should be as difficult to break as possible.

6) The algorithm should suit the number representation of the
original number.

7) The transformation should have parameters which can be easily
altered without changing the algorithm.

Which of the following algorithms should be chosen depends on the
cost-effectiveness in the particular situation. For example, a code suitable
for military information transmission may not be suited for postal
transmissions, even though the same type of information is transmitted
in both the cases.

In the following algorithms, only numeric data of fixed
length has been considered, but extensions are easy. Transformations
T are presented, where each digit $a_i$ of the original number takes
one of the values 0, 1, 2, ..., 9, and the symbols of the coded number
could be digits, alphabets or other symbols.

Algorithm 1 : Permutation :-

Here T is an identity matrix of order n with its columns
permuted, so that

$$(a_1\ a_2\ \dots\ a_n) \rightarrow (a_{i_1}\ a_{i_2}\ \dots\ a_{i_n})$$

with each $i_k$ being some index from 1 to n, each used uniquely.

This algorithm does not require matrix multiplication, and
is, in fact, very simple. The parameter is the permutation sequence,
having n! different values. All the error control features of the
original number are fully preserved. If the BCD representation is
used, a FORTRAN program for this will require only six statements.
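
As an illustration of Algorithm 1, the following is a minimal sketch in Python rather than FORTRAN. The function names and the particular permutation sequence are assumptions made for the example; the number is handled as a string of digits so that leading zeros survive.

```python
def permute(digits, order):
    """Return the digits rearranged so that position k holds digits[order[k]]."""
    if sorted(order) != list(range(len(digits))):
        raise ValueError("order must use every position exactly once")
    return "".join(digits[i] for i in order)


def unpermute(coded, order):
    """Invert permute() for the same permutation sequence."""
    original = [""] * len(coded)
    for k, i in enumerate(order):
        original[i] = coded[k]
    return "".join(original)


order = [2, 0, 4, 1, 3]          # hypothetical permutation sequence (the parameter)
coded = permute("10199", order)
print(coded, unpermute(coded, order))
```

Changing the parameter means changing only the `order` list; the two functions stay the same, which is what requirement 7 asks for.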

Algorithm 2 : Change of base :-

Here the base of the number is changed from 10 to some
greater or smaller integer. If a base greater than 10 is chosen, then
instead of using A, B, C, ... for 10, 11, 12, ..., alphabets and other symbols
could be chosen at random, thus making code breaking more difficult.
This transformation, however, may not preserve the error detecting and
correcting properties.
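
A change of base with randomly chosen digit symbols might be sketched as below. The symbol pool, the use of a seed as the secret parameter, and the function names are all assumptions for the illustration. The number is treated as an integer here, so a fixed-length field would need padding with the zero symbol.

```python
import random


def make_symbols(base, seed):
    """Pick `base` distinct symbols in a random order; the seed acts as the key."""
    pool = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    random.Random(seed).shuffle(pool)
    return "".join(pool[:base])


def to_base(number, symbols):
    """Write a non-negative integer in base len(symbols) using the given symbols."""
    base = len(symbols)
    if number == 0:
        return symbols[0]
    out = []
    while number > 0:
        number, digit = divmod(number, base)
        out.append(symbols[digit])
    return "".join(reversed(out))


def from_base(text, symbols):
    """Invert to_base() for the same symbol string."""
    value = 0
    for ch in text:
        value = value * len(symbols) + symbols.index(ch)
    return value


print(to_base(10199, "0123456789ABC"))       # conventional base-13 digits
secret = make_symbols(13, seed=7)             # randomly chosen digit symbols
print(from_base(to_base(10199, secret), secret))
```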

Algorithm 3 : 9's complement :-

In this algorithm, the digitwise 9's complement of the
number is taken. This algorithm is extremely simple both for the BCD
and binary representations. It preserves the error detecting and
correcting properties. But this transformation has no parameters.

Algorithm 4 : Generalized complementation :-

In this algorithm, the number is subtracted from a
constant larger than all the numbers in the batch. (Alternatively,
a constant could be added to or subtracted from the number.)

This has all the advantages of Algorithm 3 and, in addition,
it has a parameter which can be easily varied.
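
Algorithms 3 and 4 are both their own inverses, which the following sketch makes visible. The function names and the constant 73456 are illustrative assumptions only.

```python
def nines_complement(digits):
    """Algorithm 3: replace every digit d by 9 - d."""
    return "".join(str(9 - int(d)) for d in digits)


def generalized_complement(number, constant):
    """Algorithm 4: subtract the number from a constant larger than any number in the batch."""
    if constant <= number:
        raise ValueError("constant must exceed every number in the batch")
    return constant - number


print(nines_complement("10199"))
print(generalized_complement(10199, 73456))   # 73456 is a hypothetical parameter
```

Applying either function a second time, with the same constant in the case of Algorithm 4, returns the original number.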

Algorithm 5 : Assigning digits :-

a) Without table lookup :-

Here every digit $a_i$ is replaced by $(a_i + k) \bmod 10$, where
k is any integer from 1 to 9.

b) With table lookup :-

Here a mapping from each digit to another digit is stored in the form
of a table. 10! such mappings are possible, and it is very easy to go
from one to another.

Both these preserve the error control properties.

Algorithm 6 : Assigning several characters :-

Here we have a one-to-many mapping, e.g.

1 → A, Y, Z, 2

2 → P, H, D, 9, etc.

To transform any digit, one out of these 4 characters is chosen
at random. This serves to make code breaking extremely difficult.
As a special case we could have only one character instead of four.
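
The substitution schemes of Algorithms 5 and 6 can be sketched together. The rule (a + k) mod 10 for Algorithm 5(a) is a reading of a partly illegible formula, the table "3902817465" is a hypothetical mapping, and only the two rows for the digits 1 and 2 in Algorithm 6 are taken from the text; a full scheme would need a row for every digit, with no character shared between rows.

```python
import random


def shift_digits(digits, k):
    """Algorithm 5(a) as read here: replace each digit a by (a + k) mod 10."""
    return "".join(str((int(d) + k) % 10) for d in digits)


def table_encode(digits, table):
    """Algorithm 5(b): table is a 10-character string; table[d] replaces digit d."""
    return "".join(table[int(d)] for d in digits)


def table_decode(coded, table):
    """Invert table_encode() by looking each character up in the table."""
    return "".join(str(table.index(ch)) for ch in coded)


# Algorithm 6: each digit has several substitutes, one of which is picked at random.
CHOICES = {"1": "AYZ2", "2": "PHD9"}      # the two rows quoted in the text


def multi_encode(digits, choices, rng=random):
    return "".join(rng.choice(choices[d]) for d in digits)


def multi_decode(coded, choices):
    reverse = {ch: d for d, chars in choices.items() for ch in chars}
    return "".join(reverse[ch] for ch in coded)


print(shift_digits("10199", 3))
print(table_encode("10199", "3902817465"))   # hypothetical table
print(multi_decode(multi_encode("1221", CHOICES), CHOICES))
```

Decoding Algorithm 5(a) is the same shift with 10 - k, and decoding Algorithm 6 works because every substitute character belongs to exactly one digit.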

Algorithm 7 : Random number generation :-

This requires a good random number generator which
does not give repeated values. For each number to be transformed
in a batch, a random number is generated, and a table of the
original number and the random number is maintained. (This is
necessary because the algorithm is not invertible.) Its main
advantage is that there is absolutely no way to break the code,
for the simple reason that there is no code. It requires an
appreciable time on the computer.
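
A minimal sketch of Algorithm 7 follows. The six-digit code length, the seed and the function name are assumptions; drawing the codes by sampling without replacement is one way of meeting the requirement that no value repeats.

```python
import random


def assign_random_codes(numbers, code_digits=6, seed=None):
    """Give every number in the batch a distinct random code and return the table."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no code is repeated
    codes = rng.sample(range(10 ** code_digits), len(numbers))
    return dict(zip(numbers, codes))


batch = [10001, 10002, 10151]
table = assign_random_codes(batch, seed=1)
reverse = {code: number for number, code in table.items()}   # needed for decoding
print(table)
```

The table itself is the only way back to the original numbers, so it has to be stored as securely as the data it protects.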

Algorithm 8 : Number dependent coding :-

If the number has n digits, choose an integer k,
$1 \le k \le n$. Then the algorithm is

[equation illegible in the source]

Here, effectively, the algorithm changes from number to number.
In addition, k can be varied. It preserves the error control
properties.

Algorithm 9 : Number dependent rotation :-

Find

$$K = \sum_{i=1}^{n} a_i$$

and rotate the number right or left by K places.

Here also, effectively, the algorithm is number dependent.
In general, any symmetric function $f(a_1, a_2, \dots, a_n)$ which gives
integer values can be used.
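
The reason a symmetric function is needed can be seen in a short sketch: a rotation only rearranges the digits, so the same K can be recomputed from the coded number. The digit sum is assumed here as the symmetric function, and the rotation is taken to the left; both choices, like the function names, are illustrative.

```python
def rotation_amount(digits):
    """K as a symmetric function of the digits; here the digit sum (an assumption)."""
    return sum(int(d) for d in digits) % len(digits)


def rotate_left(digits, k):
    k %= len(digits)
    return digits[k:] + digits[:k]


def encode(digits):
    return rotate_left(digits, rotation_amount(digits))


def decode(coded):
    # Rotation does not change the digit sum, so K is recomputed from the coded form.
    return rotate_left(coded, -rotation_amount(coded))


print(encode("10198"), decode(encode("10198")))
```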

Algorithm 10 : Several algorithms in series :-

Since each transformation is invertible, any number of
algorithms could be applied one after the other.

Algorithm 11 : Several algorithms in parallel :-

The n digit number could be divided arbitrarily into
subfields, and different algorithms could be applied to the different
fields. The subfields need not contain consecutive digits.

In the most sophisticated case, we could have a series-
parallel structure of algorithms, e.g. as shown below.

[Figure: series-parallel structure of algorithms]
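
A series-parallel structure can be expressed by composing small functions. The sketch below is one possible arrangement, not the structure of the original figure: the subfields, the choice of steps and the helper names are assumptions, and both steps used are their own inverses so that decoding is simply the same stages in reverse order.

```python
def nines_complement(digits):
    return "".join(str(9 - int(d)) for d in digits)


def reverse(digits):
    """A fixed permutation: the digits in reverse order."""
    return digits[::-1]


def in_series(*steps):
    """Apply the steps one after the other (Algorithm 10)."""
    def apply(digits):
        for step in steps:
            digits = step(digits)
        return digits
    return apply


def in_parallel(*assignments):
    """Apply a different step to each subfield (Algorithm 11).

    Each assignment is (positions, step); the positions need not be adjacent.
    """
    def apply(digits):
        out = list(digits)
        for positions, step in assignments:
            piece = step("".join(digits[p] for p in positions))
            for p, ch in zip(positions, piece):
                out[p] = ch
        return "".join(out)
    return apply


split = in_parallel(((0, 2, 4), nines_complement), ((1, 3), reverse))
encoder = in_series(split, reverse)
decoder = in_series(reverse, split)       # inverse stages in the opposite order
print(encoder("10199"), decoder(encoder("10199")))
```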



5.3 Evaluation of transformations :-

Since there are many requirements of secrecy transformations,
and specific algorithms satisfy them to a varying
degree, a quantitative measure is required to properly evaluate
them.

One obvious way is to define an index

$$I = c_1 f_1 + c_2 f_2 + \dots + c_n f_n$$

where $f_1, f_2, \dots, f_n$ are the factors involved, and $c_1, c_2, \dots, c_n$
are the costs associated. The factors could be time taken, storage
required, retention of error control properties, etc. But it is
not feasible to employ this method practically, since all the
factors cannot be easily quantified.

However, given any two algorithms, it is relatively easy
to say which factor is more favourable in one of them compared to
the other. Based on this fact, a simple scheme is devised.

First, n factors are identified. The set of algorithms
$\{A_i\}$ is considered. For any two algorithms $A_1$ and $A_2$,
an n-dimensional comparison vector C is defined:

$$C = [c_1\ c_2\ \dots\ c_n]$$

Here $c_k = 1$ if the kth factor of $A_1$ is 'better' than the kth
factor of $A_2$. Otherwise it is zero. Now $A_1$ is better than $A_2$
if C for $A_1$ and $A_2$ contains more 1's than 0's.

With the above comparison, the set of algorithms can
be sorted to give the sequence $A^{(1)}, A^{(2)}, \dots$, which are in
decreasing order of 'goodness'.

If a weightage could be given to every factor, then that
could be used instead of 1 and 0.
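
The comparison scheme can be sketched as follows. The algorithm labels and factor scores are invented for the illustration (a larger score means the factor is more favourable), and ordering by the number of pairwise wins is an added assumption: a majority comparison of this kind need not be transitive, so some such rule is needed to turn it into a single sequence.

```python
def comparison_vector(a, b):
    """c_k = 1 if factor k of a is better (here: larger) than that of b, else 0."""
    return [1 if x > y else 0 for x, y in zip(a, b)]


def is_better(a, b):
    c = comparison_vector(a, b)
    return sum(c) > len(c) - sum(c)          # more 1's than 0's


def rank(algorithms):
    """Order algorithm names by number of pairwise wins, best first."""
    wins = {
        name: sum(is_better(f, g) for other, g in algorithms.items() if other != name)
        for name, f in algorithms.items()
    }
    return sorted(algorithms, key=lambda name: wins[name], reverse=True)


# Hypothetical scores on three factors for three algorithms.
scores = {"A1": [3, 3, 1], "A2": [1, 1, 3], "A3": [2, 2, 2]}
print(comparison_vector(scores["A1"], scores["A2"]))
print(rank(scores))
```

Replacing the 1's in the comparison vector by per-factor weights would give the weighted variant mentioned in the text.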



CHAPTER 6 


A CASE STUDY


6.1 Overview :-

The aim of this case study is to demonstrate the usefulness
of the data error controls and secrecy transformations described
in the previous chapters. The case chosen was the data processing
of the Joint Entrance Examinations for entrance to the five IITs
and B.H.U. It was chosen because it highlights the two main areas
which form the subject matter of this thesis.

After a description of the system, the two aspects will
be dealt with. Detailed description of the other aspects is omitted
because they are not of direct interest here.

6.2 Description of the system :-

Every year, one of the IITs is put in charge of the
data processing activity. That IIT takes the responsibility, develops
the programs (this is done every year!), processes all the data,
and produces the final admission lists. In the year 1974-75, IIT
Kanpur was put in charge. A generalized system was developed which
can be used in the subsequent years, irrespective of the IIT in charge
and the computer used.

The steps in the system are outlined below (the dates/months
are omitted).

1. Advertisements appear in the major newspapers. The applicants
request application forms, get them, fill them in, and send them
to the IIT in whose zone they wish to appear for the examination,
irrespective of the IIT which they wish to join.

2. At each of the IITs, applications are sorted manually, first
examination-centerwise, and then groupwise.

3. Registration numbers are allotted as follows:

IIT Bombay    : 10001 to 19999
IIT Delhi     : 20001 to 29999
IIT Kanpur    : 30001 to 39999
IIT Kharagpur : 40001 to 49999
IIT Madras    : 50001 to 59999


Each IIT allots registration numbers to its centers in groups of 100.
Sufficient gap is left between the last registration number of
the group A candidates and the first registration number of the group
B candidates for the same center, and also between two centers. This
is to make additions possible.

Example :-

For Ahmedabad center, suppose there are 107 candidates from
group A and 34 from group B. Then the center gets the slice 10001 to
10200. Group A candidates are given numbers 10001 to 10107
and group B, 10151 to 10184.

The next center, Rajkot, gets numbers starting from
10201.
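
The allotment in this example can be reproduced with a small routine. The text does not state how large the gaps are; the sketch below assumes that group B starts at the next multiple of 50 after group A and that each center's slice is rounded up to a whole hundred, which happens to match the Ahmedabad figures. The function name and both parameters are hypothetical.

```python
def allot(start, group_a, group_b, gap_unit=50, block=100):
    """Allot numbers for one center.

    Returns (range for group A, range for group B, first number of next center).
    gap_unit and block are assumed values, chosen to match the example in the text.
    """
    def round_up(value, unit):
        return -(-value // unit) * unit

    base = start - 1                                  # e.g. 10000 for a start of 10001
    a_first, a_last = base + 1, base + group_a
    b_first = base + round_up(group_a + 1, gap_unit) + 1
    b_last = b_first + group_b - 1
    next_start = base + round_up(b_last - base + 1, block) + 1
    return (a_first, a_last), (b_first, b_last), next_start


print(allot(10001, group_a=107, group_b=34))
```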

4. Three copies of the center-wise and group-wise numbering
scheme evolved at the IITs, as given in (3) above, are sent
to the IIT in charge.

5. At each IIT, the basic information sheet is separated from
every application form, completed if necessary and possible,
scrutinized, and cards are punched from it. The scrutinizing
and punching can go on simultaneously. Each IIT has to buy
its own requirements of cards.

6. The cards are packed in separate bundles for each center and
group, sorted in the ascending order of registration numbers.
Center-wise and group-wise listings of the cards are made in
duplicate, and a final scrutiny is made.

7. The cards, arranged as in (6), together with one copy of the
listing, are sent to the IIT in charge, along with the three
copies of the numbering scheme, as in (4), through an authorised
representative.

8. The computer center at the IIT in charge loads the cards onto a
tape. The errors discovered are corrected in consultation with
the representatives of the IITs. Finally a verification list
is produced, which is certified by the representatives.
Four copies of roll lists are prepared. Each contains the
applicant's registration number, name, and space for him or
her to sign at the time of each examination. Three copies
are sent to the IITs through the representatives. One is
retained at the IITs, one is sent by post to the presiding
officer of each examination center, and the last one is
taken to the center by a representative.

9. Next, coding-cum-tabulation sheets are prepared. These are
printed on multi-part continuous pre-printed stationery.

The first part contains the applicant's registration number,
name and the secret code generated by the computer. Then
there is a part for each subject containing the code and space
for writing the marks. At the bottom of each page, there is
space for writing the total of marks (for each subject).

These forms are printed, sealed, and sent to the Chairman
of the admissions committee of each IIT, through responsible
persons. They are opened only at the time of transcribing the
codes onto the answer scripts.

10. After the examinations, the answer scripts, along with the
roll lists bearing the signatures of the candidates who were
present, are brought to the respective IITs through responsible
persons. The code-cum-tabulation sheets are opened, and the
code transcribers enter the codes at two places on each
answer book. One of them contains the applicant's name and
registration number, which is torn off, and the other is
retained on the answer script which is sent to the examiners.
The part torn off, together with the first part of the code-cum-tabulation
sheets, is stored in a secure place by the
Chairman. Before transcribing it is ensured that the names on
the answer script and the tabulation sheet, and also the two
registration numbers, are identical. All this is done for
answer scripts of each subject, and they are then sorted in
the order in which they appear in the tabulation sheets.

11. The answer scripts, in the order mentioned in (10), together
with the part of the tabulation sheet for that subject, are
sent to the head examiner. He distributes them to the
examiners who, after evaluation, enter marks against the code
numbers. They also total the marks for every page. Then the
scrutinizers look up the marks on the answer sheets, and enter
them against the code numbers of a second copy of the code-cum-tabulation
sheets, independently. The two lists are compared,
and the errors are corrected. The answer books, together with
two sets of marksheets signed by the examiners and scrutinizers,
are sent to the Chairman.

12. The parts for each subject bearing the same page number are
stuck together with cello tape and the form is reconstructed.
The two copies are tallied and the information is transcribed
to the third copy. The copy is sent to the IIT in charge.

13. Cards are punched at the IIT in charge. The computer prepares
merit lists separately for SC, ST and other candidates,
irrespective of whether the applicants belong to group A or B. Also
zonal merit lists are produced. Cards are punched only for
those SC/ST candidates who have appeared for all the examinations
and the other candidates who have secured at least 25% in
English and [?]0% in the other subjects. For preparing the merit
lists, English marks are not considered. In case of a tie, marks
in Mathematics are considered. If the tie still persists, marks
in Physics for group A and in Physics and Chemistry for group B
candidates are considered. If the tie still persists, the same
rank is given, but one is added to the next rank.

E.g. :- If two candidates are tied and one gets the rank 174,
then the other will also get 174, but the next candidate will
get the rank 176, not 175.

14. A meeting of the chairmen of the admissions committees decides the
cut-off point for calling candidates for interview. They take
the lists and go back to the IITs, where zonal merit lists are
prepared manually. The two lists are tallied and the errors
corrected.

15. Interview letters are sent to the candidates, and after the
interviews, final selections are made.

For last-minute changes in roll lists on account of
changes in center or group, or addition of new candidates,
new registration numbers are assigned, and codes are assigned
from a coding table sent to the chairmen of the admissions
committees.

Some parts of the system, such as getting the question
papers, are secret, and cannot be revealed. Anyway, they are
outside the purview of the computer data processing.
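
The tie rule in step 13 (equal candidates share a rank and the following rank is skipped) can be sketched as below. The candidate names and marks are invented, and the sort key is simplified to a tuple of total, Mathematics and Physics marks; the actual tie-break order for group B also involves Chemistry.

```python
def assign_ranks(candidates):
    """candidates: list of (name, key), where a larger key ranks higher.

    Candidates with equal keys share a rank; the rank after a tie is skipped.
    """
    ordered = sorted(candidates, key=lambda item: item[1], reverse=True)
    ranks = []
    previous_key = None
    for position, (name, key) in enumerate(ordered, start=1):
        if ranks and key == previous_key:
            rank = ranks[-1][1]          # same rank as the candidate just above
        else:
            rank = position              # position already accounts for skipped ranks
        ranks.append((name, rank))
        previous_key = key
    return ranks


# Hypothetical marks: (total excluding English, Mathematics, Physics)
marks = [
    ("P", (250, 90, 80)),
    ("Q", (250, 90, 80)),
    ("R", (249, 95, 70)),
    ("S", (260, 80, 90)),
]
print(assign_ranks(marks))
```

In this illustration P and Q are tied on every key, so both receive rank 2 and R receives rank 4, mirroring the 174, 174, 176 example.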

Secrecy transformations :-

The registration number is numeric and has 5 digits. The
first digit can take 5 values representing the 5 IITs, as follows:

1 : IIT Bombay

2 : IIT Delhi

3 : IIT Kanpur and B.H.U.

4 : IIT Kharagpur

5 : IIT Madras

The other four digits are not continuous, since for every center, the
number begins with the next hundreds place (in order to leave gaps
for emergency registrations, transfers, etc.).

It was required to transform this to a 4-character
alphanumeric code (first two alpha, and the last two numeric) such
that the identity of the IITs was preserved. The codes were to be
generated by the computer. The following algorithm was chosen.

The first digit was transformed as

1 → A, B, C, or D

2 → E, F, G, or H

3 → J, K, L, or M

4 → N, P, Q, or R

5 → S, T, U, or V

Which of these 4 characters is to be used is decided by the next two
digits.

Twenty-five different alphabets in a random order are chosen
(QUICK...). The second and third digits can have values 00 to 99.
Twenty-five of these hundred are associated with each of the four characters.

E.g. :- For IIT Bombay,

101xx to 125xx → Aaxx
126xx to 150xx → Baxx
151xx to 175xx → Caxx
176xx to 200xx → Daxx

where a is one of the 25 alphabets QUICK..., and xx represents the last
two digits.

Coding for the last two digits is done by a one-one into
mapping.

E.g. :- The registration number 10199 becomes AQ02, since the mappings
are

101 → AQ

99 → 02

A one-page FORTRAN subroutine performs the transformation both ways.
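
The transformation can be sketched in a few lines, in Python here rather than FORTRAN. Several details are not given in the text and are filled with labeled assumptions: only the first five of the 25 second letters (QUICK) are known, so the rest of `SECOND_LETTERS` is hypothetical; the one-one mapping of the last two digits is replaced by a simple shift that happens to send 99 to 02; and the value 00 for the second and third digits is assumed to wrap around to the end of the fourth range.

```python
IIT_LETTERS = {1: "ABCD", 2: "EFGH", 3: "JKLM", 4: "NPQR", 5: "STUV"}
# Only "QUICK" comes from the text; the remaining 20 letters are hypothetical.
SECOND_LETTERS = "QUICKBROWNFXJMPSVETHLAZYD"
SHIFT = 3          # hypothetical stand-in for the last-two-digit mapping (99 -> 02)


def encode(reg):
    """Turn a 5-digit registration number into a 4-character code."""
    iit, middle, last = reg // 10000, (reg // 100) % 100, reg % 100
    slot = (middle - 1) % 100            # 01..99 -> 0..98, and 00 -> 99 (assumed)
    first = IIT_LETTERS[iit][slot // 25]
    second = SECOND_LETTERS[slot % 25]
    return f"{first}{second}{(last + SHIFT) % 100:02d}"


def decode(code):
    """Invert encode()."""
    iit = next(k for k, letters in IIT_LETTERS.items() if code[0] in letters)
    slot = IIT_LETTERS[iit].index(code[0]) * 25 + SECOND_LETTERS.index(code[1])
    middle = (slot + 1) % 100
    last = (int(code[2:]) - SHIFT) % 100
    return iit * 10000 + middle * 100 + last


print(encode(10199), decode(encode(10199)))
```

With these assumptions the number 10199 maps to AQ02 as in the example, and the first letter of any code still identifies the IIT.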

Error control :-

1. Almost every manual task is done twice and the results are tallied.
The errors are corrected manually.

2. In the roll-cum-tabulation sheets, after the marks are entered and
they are stuck together by cello tape, the marks in every row
are added. The marks in every column are already added by the
examiners. The grand total of both these types of totals should
tally, for every page.

3. The registration numbers are assigned sequentially. When they are
brought to the IIT in charge they are fed to an edit program. The
program reads the cards and discovers the missing and duplicate
cards. Some of the errors can be corrected at once. For example,
if the group is punched wrongly, say B is punched for A, then this
can be found from the range in which the registration number lies.
Or, if one registration number does not conform to the sequence, e.g.,

..., 10123, 10124, 10124, 10525, 10126, ...

then the 5 can be corrected and put as '1'.
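
The checks made by such an edit program might look like the sketch below. It is not the program used in the case study: the function names are invented, and the correction rule (accept a missing number that differs from the stray number in exactly one digit, provided there is only one such candidate) is an assumption suggested by the example above.

```python
from collections import Counter


def check_sequence(numbers, first, last):
    """Report missing, duplicate and out-of-range registration numbers."""
    counts = Counter(numbers)
    missing = [n for n in range(first, last + 1) if n not in counts]
    duplicates = sorted(n for n, k in counts.items() if k > 1)
    out_of_range = sorted(n for n in counts if not first <= n <= last)
    return missing, duplicates, out_of_range


def suggest_correction(wrong, missing):
    """Return the missing number that differs from `wrong` in exactly one digit, if unique."""
    close = [m for m in missing
             if len(str(m)) == len(str(wrong))
             and sum(a != b for a, b in zip(str(wrong), str(m))) == 1]
    return close[0] if len(close) == 1 else None


cards = [10123, 10124, 10124, 10525, 10126]
missing, duplicates, strays = check_sequence(cards, 10123, 10126)
print(missing, duplicates, strays)
print(suggest_correction(strays[0], missing))
```

For the sequence in the text this reports 10125 as missing, 10124 as duplicated and 10525 as out of range, and proposes 10125 as the correction.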

But in spite of all these checks and corrections, it has been
observed that there are some errors which cannot be corrected. Then
the basic information sheets of the application forms have to be
consulted. Since altogether there are about 20,000 applicants every
year, that many sheets cannot be brought to the IIT in charge. Moreover,
rules do not permit moving them. So in order to correct these errors,
the representatives have to go back to their IITs. And since the
examinations have to be held on schedule, this creates problems.

So it is not a question of just detecting the errors; they
have to be corrected. The error-correcting scheme is required only for
registration numbers. Errors in the name field can be tolerated and
errors in the group field can be corrected. This leaves only the
field specifying SC/ST. Since their number is small, this can be checked
manually, even after the examinations.

The registration number will have to be increased by two
digits. The scheme described in the earlier chapters can be directly
used. The weights are as below:

Position :  7  6  5  4  3  2  1
Weights  :  9  3  7  8  5  2  1
            8  9  6  1  7  5  2

The coding, error detecting and correcting algorithms
are exactly the same.

In conclusion, it may be said that it is imperative to
correct the errors in time, because even one error in the 20,000
records may mean ruining the career of a good student.







CHAPTER 7 


CONCLUSION


Though information systems have proliferated widely in
recent years, and topics like system analysis and design and management
of projects have received wide attention, the control aspect
has been neglected, and unjustifiably so. In the present work, two
aspects of this problem, namely data error control and secrecy
transformations, have been looked into.

It was found that input data editing procedures were
naive and inadequate. They tend to increase the overall time for
entering a batch of data, and necessitate many more keystrokes than
actually required, especially if the error rate is high. It was also
observed that there does not exist any error correcting code which
can be profitably incorporated into information systems to solve
this problem. A simple and elegant error correcting code was
invented which was suitable for the purpose. With the help of a
model it was shown that data entry methods having this feature will
do better than the existing methods, under reasonable assumptions.

In the area of secrecy transformations, the need was for
simple and efficient codes which at the same time should be difficult
to break. Several basic algorithms were presented, and it was
indicated how codes of any desired degree of sophistication can be
built from them by series-parallel combinations. In order to choose
objectively from amongst the codes, a simple technique was evolved to
compare the codes, taking note of the fact that many of the factors
involved are subjective.

After a brief survey of the various application areas,
it is felt that for any file, the field can economically have the
error-correcting code. Also, for such other important fields like
cash, inventory level, etc., introduction of the error-correcting
code could turn out to be profitable, depending upon the actual costs.
In some areas like defence information systems, medical information
systems and spacecraft control, where errors cannot be tolerated and
turn-around time is critical, the error correcting code could be very useful.

The secrecy transformations developed can find ready
application in defence intelligence, examination data processing,
shared direct access files used by competing business concerns,
company files containing personnel evaluation, bank information systems
containing the credit-worthiness of customers, large national data
banks, etc.